12 research outputs found

    A Hybrid Rule-Based and Neural Coreference Resolution System with an Evaluation on Dutch Literature

    We introduce a modular, hybrid coreference resolution system that extends a rule-based baseline with three neural classifiers for the subtasks of mention detection, mention attributes (gender, animacy, number), and pronoun resolution. The classifiers substantially increase coreference performance in our experiments with Dutch literature across all metrics on the development set: mention detection, LEA, CoNLL, and especially pronoun accuracy. However, on the test set, the best results are obtained with rule-based pronoun resolution. This inconsistent result highlights that the rule-based system is still a strong baseline, and more work is needed to improve pronoun resolution robustly for this dataset. While end-to-end neural systems require no feature engineering and achieve excellent performance in standard benchmarks with large training sets, our simple hybrid system scales well to long-document coreference (>10k words) and attains superior results in our experiments on literature.

    RUG-1-Pegasussers at SemEval-2022 Task 3: Data Generation Methods to Improve Recognizing Appropriate Taxonomic Word Relations

    This paper describes our system created for the SemEval 2022 Task 3: Presupposed Taxonomies - Evaluating Neural-network Semantics. This task is focused on correctly recognizing taxonomic word relations in English, French and Italian. We develop various data generation techniques that expand the originally provided train set and show that all methods increase the performance of models trained on these expanded datasets. Our final system outperforms the baseline from the task organizers by achieving an average macro F1 score of 79.6 on all languages, compared to the baseline's 67.4.

    Bewertung von Lebensmitteln verschiedener Produktionsverfahren - Statusbericht 2003 [Evaluation of Foods from Different Production Methods - Status Report 2003]

    TABLE OF CONTENTS:
    1 Introduction
    2 Structure of the study
    3 Quality of foods by production method 3.1 Process quality 3.1.1 Procedures and elements of process quality in agricultural production 3.1.1.1 Life cycle assessments 3.1.1.2 Comparison of production methods for individual environmental impact categories 3.1.2 Process quality with particular regard to processing 3.1.3 Process quality of products - assessment by consumers 3.1.4 Conclusions, recommendations, and research needs in the area of process quality 3.2 Product quality 3.2.1 Influences on product quality that are independent of the production method 3.2.2 Legally required quality (food safety) 3.2.3 Nutritional-physiological quality 3.2.4 Sensory value 3.2.5 Suitability value 3.2.6 Conclusions, recommendations, and research needs in the area of product quality
    4 Complementary approaches to assessing food quality 4.1 Picture-forming methods 4.2 Post-harvest behavior 4.3 Fluorescence excitation spectroscopy 4.4 Physiological amino acid status 4.5 Electrochemical methods 4.6 Feed preference and feeding trials 4.7 Consequences for research on assessing food quality - conceptual models
    5 Socioeconomic aspects of organically produced foods in Germany 5.1 Organically produced foods from the consumer's perspective 5.2 Effects of organic dietary styles on health care costs and resource consumption 5.2.1 Effects on health care costs 5.2.2 Effects on costs in the areas of environment and resources 5.3 Sustainable development in the field of nutrition 5.4 Aspects of the market for organically produced foods 5.5 Organically produced foods in communal catering (GV) 5.6 Conclusions, recommendations, and research needs on socioeconomic aspects of organic foods
    6 Conclusions, recommendations, and research needs 6.1 Process quality 6.1.1 Life cycle assessments across environmental impact categories 6.1.2 Production of foods (animal husbandry) 6.1.3 Food processing 6.1.4 Assessment by the consumer 6.2 Product quality 6.2.1 Product-specific research needs for plant products 6.2.2 Product-specific research needs for organic foods 6.2.3 Product-specific quality assurance for products of animal origin 6.3 Complementary methods of quality assessment 6.4 Socioeconomic aspects 6.4.1 Organically produced foods from the consumer's perspective 6.4.2 Effects of organic dietary styles on health care costs and resource consumption 6.4.3 Aspects of the market 6.4.4 Organic products in communal catering (GV) 6.5 Concluding remarks
    Appendix 1: References Appendix 2: Glossary/Legal framework Appendix 3: Fundamentals of food law Appendix 4: Holism in food research

    CreoleVal: Multilingual Multitask Benchmarks for Creoles

    Creoles represent an under-explored and marginalized group of languages, with few available resources for NLP research. While the genealogical ties between Creoles and other highly-resourced languages imply a significant potential for transfer learning, this potential is hampered by the lack of annotated data. In this work we present CreoleVal, a collection of benchmark datasets spanning 8 different NLP tasks, covering up to 28 Creole languages; it is an aggregate of brand new development datasets for machine comprehension, relation classification, and machine translation for Creoles, in addition to a practical gateway to a handful of preexisting benchmarks. For each benchmark, we conduct baseline experiments in a zero-shot setting in order to further ascertain the capabilities and limitations of transfer learning for Creoles. Ultimately, the goal of CreoleVal is to empower research on Creoles in NLP and computational linguistics. We hope this resource will contribute to technological inclusion for Creole language users around the globe.

    The Past, Present, and Future of Typological Databases in NLP

    Typological information has the potential to be beneficial in the development of NLP models, particularly for low-resource languages. Unfortunately, current large-scale typological databases, notably WALS and Grambank, are inconsistent both with each other and with other sources of typological information, such as linguistic grammars. Some of these inconsistencies stem from coding errors or linguistic variation, but many of the disagreements are due to the discrete categorical nature of these databases. We shed light on this issue by systematically exploring disagreements across typological databases and resources, and their uses in NLP, covering the past and present. We next investigate the future of such work, offering an argument that a continuous view of typological features is clearly beneficial, echoing recommendations from linguistics. We propose that such a view of typology has significant potential in the future, including for language modeling in low-resource scenarios.
